A Novel Robust Speech Feature Based on the Mellin Transform and Speaker Normalization

Authors

Jingdong CHEN

Bo XU

Taiyi HUANG

Abstract

In [1] we proposed a novel robust speech feature: the modified Mellin transform of the log-spectrum of the speech signal, abbreviated MMTLS. Because the modified Mellin transform is scale-invariant, the MMTLS is insensitive to differences in vocal tract length between speakers, which makes it better suited to speaker-independent speech recognition than the widely used MFCC. This paper proposes an improved MMTLS. Experiments show that the improved MMTLS yields better recognition performance than the original MMTLS. For comparison, speaker normalization based on frequency warping (FWP) is also investigated. Experiments show that the improved MMTLS-based speaker-independent recognizer performs much better than the MFCC-based one, even when the latter is combined with a speaker normalization technique.
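The scale-invariance property mentioned in the abstract can be checked numerically. The following is a minimal sketch, not the authors' MMTLS: it assumes the standard Mellin transform evaluated on the imaginary axis, M(w) = integral of f(x) * x^(-jw) dx/x, and a made-up two-bump envelope standing in for a log-spectrum. The names `mellin_magnitude`, `toy_envelope`, and `alpha` are illustrative only. Substituting x = exp(u) turns the integral into a Fourier transform over u, so scaling the frequency axis by `alpha` becomes a shift in u that changes only the phase.

```python
import numpy as np

def mellin_magnitude(f, omegas, u_min=-8.0, u_max=8.0, n=4096):
    """|M(w)| for M(w) = integral f(x) x^(-jw) dx/x, via the substitution x = exp(u)."""
    u = np.linspace(u_min, u_max, n)             # uniform grid on the log axis
    du = u[1] - u[0]
    fx = f(np.exp(u))                            # samples of f at x = exp(u)
    kernel = np.exp(-1j * np.outer(omegas, u))   # e^{-jwu}, shape (len(omegas), n)
    return np.abs((kernel @ fx) * du)            # Riemann-sum approximation

def toy_envelope(x):
    # Hypothetical smooth envelope (assumption, not data from the paper):
    # two bumps that decay quickly on the log axis.
    lx = np.log(x)
    return np.exp(-lx**2 / 0.1) + 0.5 * np.exp(-(lx - 1.0)**2 / 0.05)

alpha = 1.2                                      # assumed axis scaling between two speakers
omegas = np.linspace(0.0, 10.0, 21)

m_ref = mellin_magnitude(toy_envelope, omegas)
m_scaled = mellin_magnitude(lambda x: toy_envelope(alpha * x), omegas)

# Largest difference between the two magnitude vectors.
max_diff = float(np.max(np.abs(m_ref - m_scaled)))
print(max_diff)
```

In this sketch the two magnitude vectors agree up to numerical error even though the envelope has been stretched along its axis, which is the behaviour the abstract relies on when it calls the feature insensitive to vocal tract length. Treating vocal tract length differences as a pure scaling of the frequency axis is itself a simplifying assumption.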


Similar resources

A novel robust feature of speech signal based on the Mellin transform for speaker-independent speech recognition

This paper presents a novel kind of speech feature which is the modified Mellin transform of the log-spectrum of the speech signal (short for MMTLS). Because of the scale invariance property of the modified Mellin transform, the new feature is insensitive to the variation of the vocal tract length among individual speakers, and thus it is more appropriate for speaker-independent speech recognit...


Convolutional Neural Network with Adaptive Windows for Speech Recognition

Although speech recognition systems are widely used and their accuracy continues to improve, there remains a considerable gap between their accuracy and human recognition ability. This is partly due to high speaker variability in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...


A Wavelet Based Approach for Speaker Identification from Degraded Speech

This paper presents a robust method for speaker identification from degraded speech signals. The method uses Mel-frequency cepstral coefficients (MFCCs) extracted from both the degraded speech signals and the wavelet transform of these signals. It is known that the MFCC-based speaker identification method is not robust enough in the presence of noise and telephone degradations....


A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

The quality of a speech signal degrades significantly in the presence of environmental noise, which impairs the performance of hearing aids, automatic speech recognition systems, and mobile phones. This paper considers single-channel enhancement of speech corrupted by additive noise. A dictionary-based algorithm is proposed to train the speech...


Mel- and Mellin-cepstral Feature Extraction Algorithms for Face Recognition

In this article, an image feature extraction method based on two-dimensional (2D) Mellin cepstrum is introduced. The concept of one-dimensional (1D) mel-cepstrum that is widely used in speech recognition is extended to two-dimensions using both the ordinary 2D Fourier transform and the Mellin transform. The resultant feature matrices are applied to two different classifiers such as common matri...




Publication date: 2005
